Table partitioning

Breaking the table logically into smaller segments based on some criteria, i.e. the column which we know is going to get used very frequently in where clause.

Partitioning makes the table easy to manage, performance is enhanced.

Advantages

  • Improves query performance and is cost-effective
  • Increases parallelism
  • Independent partitions can be managed indipendently
  • Different partitions can live on different filesystems: less frequently accessed data can be placed on slower disks
  • Useful while doing upserts on bulk table (for daily incremental loads in datawarehouse)

Bulk data example

We have a huge target table in the datawarehouse. It's not partitioned.

The source table is some transactional database containing monthly data and gets refreshed daily with increments.

"alt"

Solution

Partitioning target table on month. And daily switching the partition with a new one from source table.

This takes less time "alt"

Creating partitioned table

When creating a table, in Partition and cluster settings, choose between:

Require partitioning filter (optional)

Users are mandate to include a WHERE clause on the partitioning column qhile they query the table.

This implies that user is forced to take full use of partitioning while the table is partitioned. It may reduce cost and improve performances

Copy and paste

"alt"

DML ops

Same as normal but adding a fltering on the partition